write chunks in TextParserStreamer #3025

pavel-esir · 2025-11-17T11:58:56Z

Description

Return DeltaMessage in incremental TextStreamerParser instead of accumulated msg. But accumulated is still available via get_parsed_message()

CVS-CVS-176146

Fixes #(issue)

Checklist:

Tests have been updated or added to cover the new code.
This patch fully addresses the ticket.
I have made corresponding changes to the documentation. TODO

Copilot

Pull Request Overview

This PR refactors the TextParserStreamer to return incremental delta messages in the write() callback instead of accumulated messages, while still maintaining accumulated message access via get_parsed_message(). The key changes update the reasoning parser implementation to properly handle delta messages and modify the concatenation logic in the streamer.

Modified TextParserStreamer to write delta messages instead of accumulated messages
Updated ReasoningIncrementalParser to populate message fields for incremental output
Added concatenate_json_containers() helper function for message accumulation

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
tests/python_tests/test_parsers.py	Updated tests to use `get_parsed_message()` for assertions and removed manual message accumulation in custom streamers
src/cpp/src/text_streamer.cpp	Added `concatenate_json_containers()` function and modified `write()` to return delta messages while maintaining internal accumulation
src/cpp/src/parsers.cpp	Refactored `ReasoningIncrementalParser` to populate `content` field in delta messages and removed `keep_original_content` logic from helper methods

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/cpp/src/text_streamer.cpp

src/cpp/src/parsers.cpp

Copilot

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

tests/python_tests/test_parsers.py:1

Assignment to reason_str loses previously accumulated reasoning content. The variable should be appended to, not replaced, to preserve partial reasoning text from previous chunks.

# Copyright (C) 2023-2025 Intel Corporation

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/python_tests/test_parsers.py

src/cpp/src/text_streamer.cpp

src/cpp/src/parsers.cpp

Copilot

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/cpp/src/parsers.cpp

src/cpp/src/text_streamer.cpp

src/cpp/src/parsers.cpp

src/cpp/src/text_streamer.cpp

src/cpp/src/parsers.cpp

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/python_tests/test_parsers.py

tests/cpp/parser.cpp

src/cpp/src/parsers.cpp

src/cpp/src/json_container.cpp

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/cpp/src/parsers.cpp

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/python_tests/test_parsers.py

src/cpp/src/parsers.cpp

Co-authored-by: Copilot <[email protected]>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/cpp/src/parsers.cpp

Co-authored-by: Copilot <[email protected]>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

tests/python_tests/test_parsers.py

src/cpp/src/text_streamer.cpp

src/cpp/src/json_container.cpp

Co-authored-by: Copilot <[email protected]>

Copilot

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-28T11:39:20Z

src/cpp/src/parsers.cpp

        // Extract reasoning content between tags
-        message["reasoning_content"] = std::string(txt_chunk.substr(open_idx + m_open_tag.size(), 
-                                                                    close_idx - (open_idx + m_open_tag.size())));
+        message["reasoning_content"] = std::string(txt_chunk.substr(open_idx + m_open_tag.size(), close_idx - (open_idx + m_open_tag.size())));


[nitpick] The calculation close_idx - (open_idx + m_open_tag.size()) is duplicated from the removed line. Consider extracting this to a variable for clarity, e.g., size_t reasoning_length = close_idx - (open_idx + m_open_tag.size());

Suggested change

message["reasoning_content"] = std::string(txt_chunk.substr(open_idx + m_open_tag.size(), close_idx - (open_idx + m_open_tag.size())));

size_t reasoning_length = close_idx - (open_idx + m_open_tag.size());

message["reasoning_content"] = std::string(txt_chunk.substr(open_idx + m_open_tag.size(), reasoning_length));

Copilot · 2025-11-28T11:39:21Z

src/cpp/src/parsers.cpp

+            // Intentionally clear delta_text: no delta content is returned to the user during this phase
+            // (we are waiting for the <think> tag to be fully detected in the cache).


[nitpick] The comment above this line describes why delta_text is cleared, but it should be updated to reflect that this intentional clearing is part of the caching strategy while waiting for the full <think> tag.

Suggested change

// Intentionally clear delta_text: no delta content is returned to the user during this phase

// (we are waiting for the <think> tag to be fully detected in the cache).

// Intentionally clear delta_text as part of the caching strategy:

// no delta content is returned to the user during this phase because we are

// accumulating partial data in m_text_cache until the full <think> tag is detected.

pavel-esir · 2025-11-28T11:53:39Z

src/cpp/include/openvino/genai/parsers.hpp

     * @param message JsonContainer to store parsed results and reasoning metadata
     * @param delta_text New text chunk to be processed in this step
     * @param delta_tokens Optional vector of new token IDs to be processed
     * @return std::string Filtered text with reasoning content processed according to configuration


Update doc. Replace message -> delta_message in @param message JsonContainer to store parsed results and reasoning metadata

Copilot AI review requested due to automatic review settings November 17, 2025 11:58

github-actions bot added no-match-files category: GGUF GGUF file reader category: text streamer labels Nov 17, 2025

pavel-esir requested a review from apaniukov November 17, 2025 11:59

Copilot AI reviewed Nov 17, 2025

View reviewed changes

src/cpp/src/text_streamer.cpp Outdated Show resolved Hide resolved

src/cpp/src/text_streamer.cpp Outdated Show resolved Hide resolved

src/cpp/src/parsers.cpp Outdated Show resolved Hide resolved

src/cpp/src/parsers.cpp Outdated Show resolved Hide resolved

write chunks in TextParserStreamer

2be242f

pavel-esir force-pushed the write_chunks_during_parse branch from 9b7cd5c to a41f0d0 Compare November 21, 2025 12:08

Copilot AI review requested due to automatic review settings November 21, 2025 12:08

Copilot AI reviewed Nov 21, 2025

View reviewed changes

pavel-esir force-pushed the write_chunks_during_parse branch 2 times, most recently from d71fc3f to 4017fb5 Compare November 21, 2025 12:23

Copilot AI review requested due to automatic review settings November 21, 2025 12:23

Copilot AI reviewed Nov 21, 2025

View reviewed changes

src/cpp/src/parsers.cpp Show resolved Hide resolved

src/cpp/src/parsers.cpp Show resolved Hide resolved

Fix pytests and gtests for parsers

4017fb5

apaniukov reviewed Nov 24, 2025

View reviewed changes

src/cpp/src/text_streamer.cpp Outdated Show resolved Hide resolved

apaniukov reviewed Nov 24, 2025

View reviewed changes

src/cpp/src/parsers.cpp Outdated Show resolved Hide resolved

apaniukov reviewed Nov 25, 2025

View reviewed changes

src/cpp/src/text_streamer.cpp Outdated Show resolved Hide resolved

apaniukov reviewed Nov 25, 2025

View reviewed changes

src/cpp/src/parsers.cpp Outdated Show resolved Hide resolved

get rid of 'ensure_message_fields'; extend texts

6e5dcc2

Copilot AI review requested due to automatic review settings November 27, 2025 23:31

github-actions bot added the category: CPP API Changes in GenAI C++ public headers label Nov 27, 2025